Skip to content

Created generic ServersView object#6278

Draft
dlmarion wants to merge 12 commits intoapache:mainfrom
dlmarion:monitor-columns-compute-serverside
Draft

Created generic ServersView object#6278
dlmarion wants to merge 12 commits intoapache:mainfrom
dlmarion:monitor-columns-compute-serverside

Conversation

@dlmarion
Copy link
Copy Markdown
Contributor

Replaced ScanServerView with a generic ServersView object
which contains the column definitions and the data for each
server.

@dlmarion dlmarion added this to the 4.0.0 milestone Mar 30, 2026
@dlmarion dlmarion requested a review from DomGarguilo March 30, 2026 21:13
@dlmarion dlmarion self-assigned this Mar 30, 2026
@DomGarguilo
Copy link
Copy Markdown
Member

I think there are a lot of things to consider here. I like the idea of trying to make things more general and reusable but it also adds complexity

  • I think the addition of the new MetricColumnMappings adds a lot of maintenance since we will need to keep it in sync with Metric.java. There is a lot of overlap between those two that could drift over time
  • The final shape of the table is harder to reason about now. Previously we just defined the columns and how they would be displayed on the front end and made sure the DTO matched that contract. Now things are spread across several places like the new ServersView object, MetricColumnMappings and the front end uiClass renderers.
  • One thing I like about the current DTO-per-table/page approach is that it makes the contract very explicit. It is easier to reason about what data is being sent, which columns are shown, and in what order they will be rendered.
  • There is still page-specific choice that is needed even with this refactor and this approach seems to just move that complexity rather than remove it

There may be some middle ground here that allows us to share commonalities between server pages but im not sure what that is yet

@DomGarguilo
Copy link
Copy Markdown
Member

A middle ground might be to keep the curent per-page DTO but extract a small shared server identity/common-fields object (server type, resource group, address, last contact) then each page can use that shared data to create its own explicit row DTO.

On the front end, we could do something similar where we keep the columns explicitly defined but reuse small helpers for the common columns. That gives us real reuse where the overlap exists without making the table shape harder to reason about

@dlmarion
Copy link
Copy Markdown
Contributor Author

dlmarion commented Apr 1, 2026

I think the addition of the new MetricColumnMappings adds a lot of maintenance since we will need to keep it in sync with Metric.java. There is a lot of overlap between those two that could drift over time

True. I initially had the column mapping information in the Metric.java class, but figured someone might object to Monitor related code being there. It doesn't matter to me where it lives.

The final shape of the table is harder to reason about now.

One of the issues I was trying to solve with this PR is that the current pages are mainly a translation of the pages from 2.1 and 4.0 added many more metrics that we can't visualize during the development cycle to determine which columns should be included in the tables.

One thing I like about the current DTO-per-table/page approach is that it makes the contract very explicit. It is easier to reason about what data is being sent, which columns are shown, and in what order they will be rendered.

I don't have an issue with the current DTO approach, except that it's currently limiting what is being shown in the table. If we added all of the metrics to the current DTOs so that we could determine which columns to include, that would be another approach.


Additionally, I'll add that currently the changes in this PR only affect the Scan Server page, but I think it could be a general way to present the metrics for all of the server pages under the Servers menu in the NavBar.

There are a couple of problems with the current approach:

  1. When a metric is added or removed from a server process, the following items probably also need to be modified:
    a. The SystemInformation or Endpoints class to update the DTO
    b. The page template file (*.ftl) to account for table header changes
    c. The javascript file (*.js) to account for the DataTable changes.
  2. From a monitor development perspective, the current pages don't show us all of the information that we could be displaying. The current pages are a translation of the pages from 2.1, and many more metrics have been added in 4.0 that are not being shown.
  3. The DTO provides for some efficiency between the browser and the Monitor server, but the current code doesn't do anything yet to limit which Metrics are being returned to the Monitor.

This approach allows us to display all of the metrics during development so that we can determine which ones make sense (I noticed immediately that the tablet recovery metrics don't make sense for scan servers). The Monitor calls AbstractServer.getMetrics on each server process, so we can tailor that method to include/exclude the metrics that we need to send to the Monitor and the server process tables on the Monitor will adapt automatically.

@ctubbsii
Copy link
Copy Markdown
Member

ctubbsii commented Apr 2, 2026

  • I think the addition of the new MetricColumnMappings adds a lot of maintenance since we will need to keep it in sync with Metric.java. There is a lot of overlap between those two that could drift over time

I think some overlap could be removed by pushing the description and type into Metric.java, though in Metric.java, we may need to add a "summary", because the existing descriptions may be too detailed for this purpose (they exist for the generated docs), and the type will probably be an enum that's translated into the specific "big-num, etc." type keywords used by DataTables.

  • One thing I like about the current DTO-per-table/page approach is that it makes the contract very explicit. It is easier to reason about what data is being sent, which columns are shown, and in what order they will be rendered.
  • There is still page-specific choice that is needed even with this refactor and this approach seems to just move that complexity rather than remove it

Can we just do a pattern of performing a per-table transform/filter before rending the table contents? Will that be enough to make local page-specific or table-specific choices and still be able to reuse the generic code for rending the table from the transformed contents?

@ctubbsii
Copy link
Copy Markdown
Member

ctubbsii commented Apr 2, 2026

  1. When a metric is added or removed from a server process, the following items probably also need to be modified:
    a. The SystemInformation or Endpoints class to update the DTO
    b. The page template file (.ftl) to account for table header changes
    c. The javascript file (
    .js) to account for the DataTable changes.

Not having to modify the DataTables JS code if we add/remove a column is definitely a convenience in this approach, but we still have to make the decision to include the new metric in the subset of those shown on the page somewhere. In this approach, it's in MetricColumnMappings.java. Moving it to code instead of the template/javascript is convenient if we don't have any changes in presentation, and just want text, but having the decision in the template/js/css allows us to do more creative things like show an indicator light icon, a graphical progress bar, or other things. So, what we gain in convenience takes away some creative flexibility. Some of that flexibility can probably be gained back by doing local transformations of the data, for presentation.

  1. From a monitor development perspective, the current pages don't show us all of the information that we could be displaying. The current pages are a translation of the pages from 2.1, and many more metrics have been added in 4.0 that are not being shown.

I don't think we actually want most of those to be shown. During development, while we are deciding what is useful to show, it's definitely convenient to see more with less work, but as we work to finalize things, we really should be focused on presentation of the useful things, and avoid turning the monitor into a big tabular display of metrics (a proper metrics collection service should provide that kind of thing instead, whereas the monitor should be focused system deployment status, i.e. "which deployed system components are running and what are they doing right now?").

  1. The DTO provides for some efficiency between the browser and the Monitor server, but the current code doesn't do anything yet to limit which Metrics are being returned to the Monitor.

This approach allows us to display all of the metrics during development so that we can determine which ones make sense (I noticed immediately that the tablet recovery metrics don't make sense for scan servers). The Monitor calls AbstractServer.getMetrics on each server process, so we can tailor that method to include/exclude the metrics that we need to send to the Monitor and the server process tables on the Monitor will adapt automatically.

Is that getMetrics() method only used by the monitor, or is that endpoint also used by other things, like a registered metrics sink? Although those particular metrics make sense to filter out on the ScanServer's override of that method, we should be careful about using this mechanism to tailor the metrics for the monitor because the monitor will probably end up using a much smaller subset of available server metrics than other callers of that method.

@dlmarion
Copy link
Copy Markdown
Contributor Author

dlmarion commented Apr 2, 2026

Is that getMetrics() method only used by the monitor, or is that endpoint also used by other things, like a registered metrics sink?

The Monitor is currently the only thing using that endpoint. The Monitor fetching the metrics from the server processes via this method is what is going to enable us to evenutally remove the ManagerMonitorInfo and related code from the Manager.

@DomGarguilo
Copy link
Copy Markdown
Member

I'm thinking there are at least 3 ways we could structure things generally

  1. Explicit on both sides
    • Backend: define page specific DTO that matched the table exactly
    • Frontend: define columns/rendering explicitly
    • Pros:
      • clearest contract, easiest to reason about
    • Cons:
      • slowest to iterate and more boilerplate
  2. Server-defined table
    • Backend: send columns + how to render them (ui classes) + data
    • Frontend: render whatever the backend sends
    • Pros: fast to iterate, backend can change table shape alone
    • Cons: weakest contract, harder to reason about final table shape
  3. Frontend-defined table
    • Backend:
      • listen for requests for which metrics to send + common metrics (server type, address, timestamp)
    • Frontend:
      • define columns/order/labels/rendering explicitly
      • request specific metrics via their ID (name)
    • Pros: page shape stays explicit, fast iteration, simpler than server defined tables, better separate of view vs data
    • Cons: metric IDs (names) become part of the frontend contract

To me frontend defined seems best. Others opinions or ideas would help

@dlmarion
Copy link
Copy Markdown
Contributor Author

dlmarion commented Apr 3, 2026

One issue, for me at least, is that I'm not a UI developer and I don't have to modify the Monitor front-end code very often. There is no documentation on how it works generally and what things in the various files need to be modified to make a simple change. Every time I have had to modify the Monitor front-end code it's days of re-learning how it works to do something simple. If I never had to touch the Monitor UI code again it would be ok with me. Unfortunately, that's not realistic.

For those reasons I would lean toward a more generic front-end solution that doesn't require a modification every time a column needs to be added or removed from a table. I understand your comments and viewpoints when looking at it from the perspective of someone with more experience doing UI development. I'm more concerned with the time it takes to modify the Monitor than anything else really.

As for this PR, my main goal is to see all the possible data in the tables during the development of 4.0 so that we can determine which columns belong and which ones don't. In the end I don't really care which approach is taken to achieve that goal except for my comments above about the maintenance burden.

@DomGarguilo
Copy link
Copy Markdown
Member

I am fine with backend-defined. Here is what I am thinking might work well (might be similar to what you have here)

  • each page defines an explicity list of Metric which the backend turns into a list of table columns
  • the backend would send column definitions, rows, status
  • every server type probably has the same common identity columns (server type, RG, last contact). So for every server page we would send those identity columns + List<Metric> then helpers would transform those into column labels, renderer hint (something like string, timestamp, count) and the row values
    • to get that to work we would either need to keep the mapping class added in this PR or add to Metric which might be cleaner and we could do it in a way that doesn't explicitly tie Metric to our frontend UI. I think we wouid just need to add two fields to Metric something like displayName (the friendly column header) and valueType (could be an enum like NUMBER, BYTES, DURATION, TIMESTAMP, etc.) which would be used to tell the front end how to render that data

The JSON shape to the front end might end up being something like:

  {
    "columns": [
      { "key": "server", "label": "Server", "type": "string" },
      { "key": "resourceGroup", "label": "Resource Group", "type": "string" },
      { "key": "lastContact", "label": "Last Contact", "type": "timestamp" },
      { "key": "accumulo.scan.start", "label": "Scan Start Count", "type": "number" }
    ],
    "rows": [
      {
        "server": "localhost:1234",
        "resourceGroup": "default",
        "lastContact": 1712151000000,
        "accumulo.scan.start": 42
      },
    ],
    "status": { whatever info we might need for the status indicators }
  }

Like I said I think this PR shares a lot of the same ideas and intent. I just wanted to express what I was thinking in case it differs and might be helpful.

@keith-turner
Copy link
Copy Markdown
Contributor

Overall this seems like a good concept to me.

Tried running cce9b3e locally and it would not build because of changes to accumulo access. To get around the build problem, cherry picked e9cc35a locally and then it built. When I looked at the rest endpoint rest-v2/sservers/view I saw the following, posting this to compare w/ what @DomGarguilo suggested.

{
  "data": [
    {
      "Last Contact": 1775490009993,
      "Resource Group": "default",
      "Server Address": "localhost:9700",
      "Server Type": "SCAN_SERVER"
    }
  ],
  "columns": [
    {
      "name": "Last Contact",
      "description": "Server last contact time",
      "uiClass": "timestamp"
    },
    {
      "name": "Resource Group",
      "description": "Resource Group",
      "uiClass": ""
    },
    {
      "name": "Server Address",
      "description": "Server address",
      "uiClass": ""
    },
    {
      "name": "Server Type",
      "description": "Type of server",
      "uiClass": ""
    }
  ],
  "status": {
    "hasScanServers": true,
    "hasProblemScanServers": false,
    "hasMissingMetrics": false,
    "scanServerCount": 1,
    "problemScanServerCount": 0,
    "missingMetricServerCount": 0,
    "level": "OK",
    "message": null
  },
  "timestamp": 1775490012993
}

We do not need to display the server type column of scan server on the scan server page.

I am fine with backend-defined. Here is what I am thinking might work well (might be similar to what you have here)

@DomGarguilo, the json you suggested seems a bit more generic and feels like it would work w/ a wide range of tables to display. Seems like it might be a good improvement. Would we be able to have a single javascript function that all code could use to take this generic json and create a HTML table? Wondering what this will look like on the UI side to make it work.

What is the display intent of the json status section in the UI? Like how should it be displayed relative to the table of data? When looking at the scan server page I did not see this data displayed.

var storedColumns = getStoredColumns();
$.each(storedColumns, function (index, col) {
//console.log('Adding column: ' + JSON.stringify(col));
var th = $(document.createElement("th"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like its manually building HTML in javascript instead of using datatables? Does this mean we lose features of data tables like sorting, search, and pagination? Or this still using datatables? I looked that the monitor page for scan servers w/ these changes and it seems to still have sorting, but I only had a single scan server so not sure if its still working.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of this PR is to make the monitor page dynamic and responsive to the columns that are returned by the server. No functionality is lost here that I know of, but the HTML columns for the table are created in the javascript instead of the .ftl file. If you turn metrics on, you'll see about 3 dozen metrics coming back from the Scan Server. There are maybe about 1 dozen metrics shown in the UI today.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No functionality is lost here that I know of, but the HTML columns for the table are created in the javascript instead of the .ftl file.

Ah ok, that was very helpful to know.

If you turn metrics on, you'll see about 3 dozen metrics coming back from the Scan Server.

I did not try that, I will give that a try and see what it looks like.

Copy link
Copy Markdown
Contributor Author

@dlmarion dlmarion Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not try that, I will give that a try and see what it looks like.

Well, it currently looks terrible, but there is a good bit of information available to display on the Monitor that we aren't currently displaying. That's the point of this change, discovering what information is available to the Monitor so that we can decide what to display and what not to display.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants